Density-Dependent Analysis of Nonequilibrium Paths Improves Free Energy 

Estimates 
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When a system is driven out of equilibrium by a time-dependent protocol that modifies the Hamil- 
tonian, it follows a nonequilibrium path. Samples of these paths can be used in nonequilibrium work 
theorems to estimate equilibrium quantities, such as free energy differences. Here, we consider an- 
alyzing paths generated with one protocol using another one. It is posited that analysis protocols 
which minimize the lag, the difference between the nonequilibrium and the instantaneous equilib- 
rium densities, will reduce the dissipation of reprocessed trajectories and lead to better free energy 
estimates. Indeed, when minimal lag analysis protocols based on exactly soluble propagators or 
relative entropies are applied to several test cases, substantial gains in the accuracy and precision 
of estimated free energy differences are observed. 



I. INTRODUCTION 

The accurate and efficient estimation of free energy 
differences is an important goal in chemical physics and 
remains an active area of research. One promising ap- 
proach to free energy estimation entails measuring the 
work done on a system over repetitions of an irreversible 
process. According to the second law of thermodynamics, 
the mean work is greater than the free energy difference 
between the end states of the process, Fa. Nonequilib- 
rium work theorems^^^ supplement this upper bound 
by rigorously equating F\ with other averaged functions 
of the work. These theorems have been empirically vali- 
dated in single-molecule pulling experiments^ and com- 
puter simulations (e.g. Ref£). 

Jarzynski's equality; 1 ^ a unidirectional nonequilibrium 
work theorem, relates the free energy difference to an 
exponential average of the work. Unfortunately, be- 
cause it uses a nonlinear (specifically, a logarithmic) func- 
tion of the average, the free energy estimator based on 
this equality suffers from a systematic finite-sampling 
biase d ! 11 While accurate in the limit of infinite sam- 
pling, this estimator is usually dominated by rare events 
where the work is less than the free energy difference, 
and thereby converges slowlyJ^ 

If the average amount of work dissipated as heat is 
reduced, these low-work events will be more frequent 
and accurate free energy estimation will usually require 
fewer work samples. The most straightforward way to 
reduce heat dissipation is to slow the rate of the pro- 
cess; in the limit of infinitely slow switching, the pro- 
cess is reversible and the work is equal to the free en- 
ergy difference. Unfortunately, reducing the switch- 
ing rate requires additional time and lowers the signal- 
to-noise ratio in single- molecule pulling experiments. 13 
Under the constraint of constant experiment length, 
it is possible to reduce heat dissipation by optimizing 
the switching protocol that controls how the thermo- 
dynamic state changes with time. Protocol variation 
predates Jarzynski's equality, having been applied to 
tightening free energy bounds from the second law of 



thermodynamics! 14 i 15 ' 16 i 17 ] 18 More recently, variational 
calculus has been applied to find optimal protocols that 
minimize the mean workj 19 ' 20 i 21 

While protocol variation is, in principle, feasible in 
laboratory experiments, many more approaches to im- 
proving nonequilibrium-based free energy estimation are 
possible in computer simulations. For example, Wu 
and Kofke were inspired by the Rosenbluth chain sam- 
pling scheme to develop methods for generating low-work 
nonequilibrium paths^ Vaikuntanathan and Jarzyn- 
ski took another approach, altering the system dynam- 
ics, to reduce heat dissipation and improve free energy 
estimates^ The approach most mathematically similar 
to this work, however, is importance sampling in nonequi- 
librium path space 

In importance sampling, samples from one distribu- 
tion are used to estimate expectations in another. The 
technique is often applied to Markov chain Monte Carlo 
and molecular dynamics simulations (where it is usu- 
ally called umbrella sampling): after applying a config- 
urational bias to overcome energy barriers and promote 
ergodicity, expectations are calculated for the unbiased 
ensemble. Importance sampling has been extended to 
transition path samplin g 29 ' 30 with nonequilibrium tra- 
jectories. In this algorithm, a biasing function modifies 
the Monte Carlo acceptance criteria of proposed paths 
in a way that improves the convergence of free energy 
estimates, 24 ! 25 ! 26 ! 27 ! 28 

Here, we apply the importance sampling formalism in 
a completely different way. Instead of sampling nonequi- 
librium trajectories in a biased manner, we focus on the 
analysis of previously generated paths. Instead of asking 
which path-ensemble we would like to sample from, we 
ask which path-ensemble average we would like to evalu- 
ate. This is accomplished by processing paths generated 
using one protocol - the sampling protocol, A s - using 
another - the analysis protocol, A. 

While we have infinite freedom in selecting an analysis 
protocol, not all choices will improve the convergence of 
free energy estimates. One reasonable strategy for choos- 
ing A is to minimize the lag, the difference between the 
nonequilibrium and instantaneous equilibrium densities; 
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FIG. 1: Lag in a moving harmonic oscillator: Potential en- 
ergy (solid line), U(x,t), and density (dashed line), p(x,t), as 
a function of position, at (a) t = and (b) t — 0.1, where 
v = 10. Sampling protocol (solid line), A s , and mean po- 
sition, xr(t), as a function of time, for (c) v = 10 and (d) 
v = 15.8019. For all parts of this figure, D = 1 and k = 25. 



Vaikuntanathan and Jarzynski found that under certain 
dynamics, dissipation is eliminated if there is no lag, lead- 
ing to a zero- variance estimator of Fy ~ To reduce the 
lag, they modified their equation of motion with an ad- 
ditional flow-field term that "escorts" the system along 
a near-equilibrium path. Essentially, this strategy mod- 
ifies the nonequilibrium density. In this paper, we take 
the opposite approach: using the analysis protocol to 
choose an instantaneous equilibrium density that closely 
matches the sampled nonequilibrium density. 

As an illustrative case, consider a Brownian particle 
in a harmonic oscillator, or spring, which moves at a 
constant velocity (Fig. [I}. If the system starts in ther- 
mal equilibrium, its density is a Gaussian about the 
initial spring position. When the spring starts mov- 
ing, the density remains a Gaussian with the same vari- 
ance, but its mean position, ccr(i), lags behind the spring 



position 
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For this particular system, an analysis pro- 



tocol based on xx(t) will have no lag. We shall further 
explore this system in Section HTT1 

One complication with using an analysis protocol that 
minimizes the lag is that its end state is usually not the 
same as in the sampling protocol. Thus, the free energy 
difference being estimated differs. To estimate the same 
F\ with a minimal lag analysis protocol, it may be nec- 
essary to extend or otherwise modify the sampling. To 
distinguish the two situations, we shall refer to the for- 
mer as protocol postprocessing and the latter as nonequi- 
librium density-dependent sampling (NEDDS). Both fall 
under the aegis of density-dependent analysis. 

The structure of this paper is as follows: in Section 
HT1 the importance sampling form of Jarzynski's equality 
is detailed; in Section HTT1 density-dependent analysis is 
demonstrated on two cases in which the propagator is 
analytically known; in Section IIV1 a general method for 
finding minimal lag analysis protocols is described, ap- 



plied to an adaptive algorithm for NEDDS, and tested on 
the model system; and lastly, implications of this method 
and possible future directions are discussed. 



II. FREE ENERGY FORMALISM 

Consider a system whose Hamiltonian, H — H(x;X), 
depends on a;, its position in phase space (or config- 
uration space), and a control parameter, A. Initially, 
the system is prepared in thermal equilibrium at A(0). 
The parameter A is perturbed according to a protocol 
A = A(i) until it reaches a final state at A(r). Jarznyski's 
equality^ 
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(1) 



relates the free energy difference between the initial and 
final states of the protocol, Fa, to an average over all pos- 
sible paths, X — x(t), resulting from this nonequilibrium 
procedure. Specifically, this expectation (denoted by the 
angled brackets (...) A ), is a path integral over infinites- 
imal elements dX with the protocol-dependent density 
Pa [A]. During each process, the work done on the sys- 
tem is W[X\A] = /J" dtX(dH/dX). (In this paper, all 
energies will be expressed in units of ksT.) 

Suppose that instead of pa[A], we consider an alter- 
nate density of paths, p s [A]. The free energy difference 
Fa can be calculated by applying a reweighed form of 
Jarzynski's equality,— 



-Fa 



-W[X|A] 



,(2) 



where r = p\[X]/p s [X] is the ratio of probabilities of 
observing the trajectory given the densities. To analyze 
a finite sample of paths drawn from p s [A] , we replace the 
expectations with sample mean estimators, obtaining, 26 
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(3) 



where N s is the sample size. In a standard Jarzynski 
estimate, r = 1. 

Previous workers have improved the convergence prop- 
erties of Eq. |3]) by choosing p s [X] to be various work- 
weighted functionals of the original density pa [A] .^ i 27 ? 28 
When introducing the single-ensemble biased path sam- 
pling approach, Ytreberg and Zuckerman picked p s [A] = 
PA [X}e- w ^/ 2 , such that r = e w W A V 2 & In a pa- 
per comparing the method with conventional equilib- 
rium procedures, Oberhofcr et. al. considered p s [A] = 
Pa[X]/P(W[X\A])£L By variation of the asymptotic 
variance with respect to the sampling bias, Oberhofer 
and Dellago found that optimal work-weighted sampling 
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is given by p s [X] = p A [x] \ e -(w[x\A]-F A ) _ 1 | j 28 Unfortu- 
nately, this optimal choice is impractical because it in- 
cludes the sought quantity Fa. 

In the present method, which applies Eq. ([3]) in a novel 
manner, p s [X] ~ p As [X] depends on the sampling pro- 
tocol and r differs from unity when the analysis protocol 
A varies from A s . Notably, the relevant work is VF[A|A], 
not VF[A|A S ], meaning that different choices of A will re- 
sult in various work distributions and convergence prop- 
erties. This new way of applying importance sampling 
leads to different, albeit analogous, asymptotic variance 
expressions. — The present approach is more general than 
previous applications of Eq. (J3J) , which require transition 
path sampling, because it does not require biased sam- 
pling and paths can be generated by ordinary dynamical 
equations. Indeed, under certain assumptions, such as 
those suggested by Nummela and Andricioaei,— it should 
be possible to apply the present method to laboratory ex- 
periments. 

We note, as a caveat, that the importance sam- 
pling form of Jarzynski's equality will only be useful for 
stochastic dynamics where r can be computed. Under 
deterministic dynamics, r is a delta function and having 
different sampling and analysis protocols will not improve 
free energy estimates. 



III. CASES WITH AN ANALYTICAL 
PROPAGATOR 

As mentioned earlier, we would like to choose an anal- 
ysis protocol that minimizes the lag, such that the in- 
stantaneous equilibrium density corresponds with the 
sampled nonequilibrium density. This is particularly 
tractable when the propagator is exactly known. Here, 
we demonstrate Eq. ([3]) on two such cases: a Brownian 
particle in a harmonic oscillator (i) moving at a constant 
velocity or (ii) with a time-dependent natural frequency. 
With both, the potential energy has the general form 
U(x) = k{x — x) 2 /2 and the nonequilibrium density is 



Pneq (^5 ^) 
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(4) 



where xr(i) and /ct(£) are the most typical paths and 
spring coefficients, respectively. As these propagators 
can be obtained by close analogy to the path integral 
derivation of work-weighted propagators^ their deriva- 
tions are not detailed here. In case (i), k is constant and 
A moves the spring position according to x — A s = vt 1 
such that AF = 0, fer(i) = k, and 



-Dkt 



)■ 



(5) 



In case (ii), x is zero and A controls the spring coeffi- 
cient, k = A S! such that AF = \ ln[fc(0)/fc(r)]. In the 



corresponding nonequilibrium density, xx{t) = 0, and 
k{Q)e 2D K ds k ^ 



k T (t) 



1 + 2£>fc(0) f* du e 2D Jo" ds fe ( s ) 
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Based on these expressions, it is evident that, for case 
(i), the minimal lag analysis protocol is A m i = xrit) 
from Eq. ([5]), and for case (ii), it is A m i = kr{t) from Eq. 
([6]). In these special cases, the nonequilibrium density 
is exactly the equilibrium density corresponding to A m / 
and there is no lag. 

To test whether density-dependent analysis leads to 
improved free energy estimates, one-dimensional Brown- 
ian dynamics simulations were run with the equation of 
motion, 



x j+ i 
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(7) 



where Xj is the position at time jAt, At is the time 
step, and Rj is a standard normal random variable. 
The primes denote spatial derivatives such that Uj = 
dU(xj;Xj)/dxj and U'! — d 2 U(xj] Xj)/dx 2 . For a dis- 
crete trajectory X — {xo, x±, Xj} sampled with the 
protocol A s = {Ao, Ai, Aj}, where J is the total num- 
ber of steps, the probability ratio is r = e~ AS , where 
AS = S[X\A] - S[X\A S ] and S[X\A] is the stochastic 
action (discretized from Ref.— ), 



S[X\A] 



U(xj;Xj) + U(x ;X ) 



At" 
AD ^ 

3=0 L 

W[X\A] 



At 



Xj+1 Xj i ' (du;) 2 - 2D 2 u'; 



(8) 



The work was evaluted with the discrete formula, 
W[X\A\ = EfoVfe+i; A,+i) - U(x j+1 ;X j )}. This ac- 
tion is valid in the continuum limit, as J — > oo and 
At — > 0. To approach this limit, we chose D = 1 and 
a time step of At = 0.001. 

The simulations were performed over 10 m steps (trun- 
cated to be an integer) , where m refers to 7 evenly spaced 
numbers between 1.5 and 3. In case (i), k was set to 25 
and A s was chosen to start from Ao = and linearly 
progress to the target state at A/ = 1. With case (ii), 
A s is a linear interpolation between 1 and 100. After- 
wards, the trajectories both analyzed with the standard 
Jarzynski estimate and subjected to protocol postpro- 
cessing with A m ;. 

For comparison, NEDDS was implemented by switch- 
ing A at a faster rate such that the final state went beyond 
A / and the final nonequilibrium density, according to the 
propagators, corresponded to the target state. This is 
illustrated in Fig. [TJi, where moving the harmonic os- 
cillator at a faster rate than in Fig. [TJ: allows for the 
final density to correspond to the equilibrium state with 
A = 1. These trajectories, which took the same amount 
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FIG. 2: Representative work- weight plot for a moving har- 
monic oscillator: W[X|A] and r of 50 paths with v — 10, 
analyzed with A = A s (squares) or A = Xxit) (circles). The 
free energy difference (shaded line) and Fa from Eq. J3]) using 
A = A s (solid line) and A = xr(t) (dashed line) are denoted 
by horizontal lines. 
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Switching time, x 

FIG. 3: Comparison of free energy estimates for a moving 
harmonic oscillator: Mean and standard deviation of 10000 
Fa estimates using 50 trajectories each, analyzed with A = A s 
(squares), A = A m i (circles), or by NEDDS (triangles). The 
latter two are slightly offset to prevent error bar overlap. 



of simulation time for the same number of steps, were 
then reanalyzed with the appropriate A m i- 

In case (i), we find that protocol postprocessing with 
A m i leads to a desirable result: most work values are re- 
duced such that a larger fraction of them are less than 
the free energy difference (Fig. [5]). Of these negative dis- 
sipation trajectories, most have a probability ratio less 
than one. Conversely, several positive dissipation trajec- 
tories have a probability ratio greater than one. For this 
set of trajectories, the modified work distribution leads 
to a more accurate free energy estimate. 

Over a large number of repetitions and range of switch- 
ing speeds, we find that free energy estimates based on 
A = A m i are vastly improved over the standard pro- 
cedure, A = A s , having significantly less variance and 
systematic bias (Fig. [3]). The standard estimator only 
approaches the accuracy and precision of protocol pro- 
cessing for slow switches. Clearly, these trajectories are 
much better at estimating the end state free energy dif- 
ferences for Ami than for A s . The estimates of Fa from 
NEDDS also require considerably less sampling than the 
standard procedure, although the effect is somewhat less 
dramatic. 

Similarly, in case (ii), density-dependent methods also 
show improvement over the standard Jarzynski estimate. 
For the time-dependent natural frequency, the system- 
atic bias of the standard estimate is relatively small 
but nonetheless evident at all sampled switching rates 
(Fig. 2]). Estimates from both density-dependent meth- 
ods have reduced bias and variance, and are found to be 
of similar quality to each other. 



IV. GENERAL CASE 

In most practical situations, unfortunately, the prop- 
agator is not known ahead of time. Thus, prior to sim- 
ulations, it is unclear how long paths need to be gener- 
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Switching time, x 

FIG. 4: Comparison of free energy estimates for a harmonic 
oscillator with a time- dependent natural frequency: Mean and 
standard deviation of 10000 Fa — Fa estimates using 50 tra- 
jectories each, analyzed with A = A s (squares), A = kr{t) 
(circles), or by NEDDS (triangles). The latter two are slightly 
offset to prevent error bar overlap. 



ated before the noncquilibrium density matches a density 
characteristic of the target state. While paths are being 
generated, however, it is possible to estimate the differ- 
ence between the sampled density and arbitrary equilib- 
rium states. States which minimize this difference can be 
be collected in an analysis protocol with minimal lag. 

One measure of the distance between two probabil- 
ity distributions is the Kullback-Leibler divergence, or 
the relative entropy. The relative entropy between 
the nonequilibrium density and an arbitrary equilibrium 
state T is, 

D KL (p ne q(x,t)\\pr(x)) = [ dx Pneq(x, t) In ^filfli. .(9) 

J Pt{x) 

When the integral is separated into two at the loga- 
rithm, one part is a constant with respect to T. The 
divergence is minimized by finding a state where the 
other, — J dx p n eq{ x ,t) lnpr(x), is the least. Using sam- 
pled discrete paths, this integral can be estimated by 



^2n=i^ n PT( x jn), where Xj n is the position at step j 
of path n. For a state T, the equilibrium density is 
Pt(x) = exp [— (Ht(x) — Ft(x))], where Ht(x) is the 
test state Hamiltonian and Ft is its free energy. Thus, 
the relative entropy is minimized by the smallest value 
of, 



DT(xjl,Xj2,--,XjN s ) = 



1 



^2 H T(Xjn) 



-F T , (10) 



among different states T. Generally, the free energy, Ft, 
is unknown, but for states which occur along the switch- 
ing protocol, Ft — Fq (where Fq is the free energy at 
Ao) can be estimated using the standard form of Jarzyn- 
ski's equality. These states constitute our search space 
for minimizing the lag. 

Suppose we are interested in the free energy difference 
between the states defined by A and A/. We can use D T 
to estimate A m ; on the fly and determine when to stop 
sampling via the following adaptive algorithm: 

1. Start with j = and the work Wo = 0. For each of 
N s paths, obtain xq by drawing samples from the 
equilibrium ensemble at Ao- 

2. Propagate each path, calculating a^+i using a dy- 
namical equation such as Eq. [7] To obtain Wj+i, 
calculate the work done on the system during the 
time step and add it to Wj. The next step in the 
sampling protocol, Aj + i, is found by adding a pre- 
determined value, fi, to A 3 . The sign of /i must be 
the same as A/ — Ao . Increment j by one. 

3. Using Wj values in the standard form of Jarzynski's 
equality, estimate Fj — Fo, the free energy difference 
between the states with Xj and Ao. 

4. For each state T corresponding to {Ao, Ai, ... Aj}, 
use Ht(x) and the free energy difference estimated 
in the previous algorithm step to calculate Dt — Fq. 
The A which minimizes Dt — Fq is X m i- Add X m i 
to the minimal lag protocol A m j. 

5. If X m i hasn't crossed A/, repeat from algorithm step 
2. Otherwise, set the final value in K m i to A/. 

6. Estimate the free energy difference using Eq. ([3]) 
with A = A m i. 

This algorithm was tested on Sun's system, 2 ^ where 
the potential energy is U(x) — x A — 16Ax 2 . Using Eq. 
Q , the free energy difference was estimated between the 
initial state with Ao = 0, where the potential is a single 
well, and the target state A/ = 1, where it is a double 
well, such that AF = —62.9407^ Brownian dynamics 
simulations were performed with the same diffusion co- 
efficient, time step, and equation of motion as in Section 
Mil The increment of A at each time step was /i = vAt, 
where v — 10™ and m refers to 9 evenly spaced values 
between and 2. For comparison, the standard Jarzynski 
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FIG. 5: Representative divergence landscape for Sun's sys- 
tem: Contour plot of Dt as a function of sampling A, es- 
timated using 50 paths with v = 10. A m ; is shown with a 
dashed line. Note that only half of this information, where 
the sampling A is less than the test A, is available on-the-fly. 
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FIG. 6: Comparison of free energy estimates for Sun's system: 
Mean and standard deviation of 10000 Fa estimates using 50 
trajectories each, analyzed with A = A s (squares) or A = A m ; 
(circles, slightly offset to prevent error bar overlap) 



estimate was applied to simulations where A is switched 
between and 1 at a slower velocity, taking the same 
total time as in the corresponding NEDDS simulations. 

In a representative set of simulations, the density most 
noticeably lags behind the sampling state at the begin- 
ning (Fig. [5]). Around the state defined by A = 0.9, the 
lag quickly diminishes. However, the minima of Dt does 
not reach the target state until the sampling A is beyond 
1. 

Based on many repetitions of this procedure at differ- 
ent pulling speeds, we find that our NEDDS algorithm 
converges much more quickly than the standard Jarzyn- 
ski estimate (Fig. ^ . The systematic bias is largely elim- 
inated with simulations that are switched nearly an or- 
der of magnitude faster. At the fastest switching rates, 
NEDDS remains biased but still outperforms the stan- 
dard Jarzynski estimate. With these fast switchings, it 
is possible that the nonequilibrium density does not cor- 
respond well to any traversed equilibrium state. 
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V. DISCUSSION AND CONCLUSION 

With the goal of minimizing the lag via the choice of 
analysis protocol, we have developed density-dependent 
methods to analyze nonequilbrium paths, to estimate 
which states may constitute a protocol that minimizes 
the lag, and to adaptively sample paths until the desired 
density is achieved. Our promising results validate the 
strategy and provide further evidence for the link be- 
tween lag and heat dissipation. They also hint that the 
accurate estimation of free energy differences may require 
adequate sampling in the important regions of both end 
states. 

Analysis protocols provide another degree of freedom 
for lag reduction, and can be used in conjunction with 
other methods, such as sampling protocol optimization 
or biased path sampling. Furthermore, their use should 
extend beyond Jarzynski's equality; they can poten- 



tially be applied in bidirectional nonequilibrium work 
expressions^ or any relationship between a nonequilib- 
rium process and a state function, such as Hummer and 
Szabo's expression for the potential of mean forced 
Quite possibly, our results are just the tip of an iceberg 
and this paper will open up new research directions for 
sampling and analyzing nonequilibrium trajectories. 
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In these appendices, we derive asymptotic variance and bias expressions for free energies estimated using protocol 
postprocessing. Our derivations are similar to that of Obcrhofer et. al. for biased sampling of nonequilibrium 
trajectories . 27 ' 28 Although these expressions are not directly relevant to the main text, they support the line of 
inquiry explored in the paper and may be useful to those who wish to expand upon the results. Both unidirectional 
and bidirectional expressions are considered. 



APPENDIX A: UNIDIRECTIONAL 

Expressed in importance sampling form, Jarzynski's equality is, 

/ e -W[X\A]\ 

e = " T~\ —i ( A1 ) 



This is Eq. (3) in the main text, reproduced here for convenience. 
Towards deriving the asymptotic expressions, we first define, 

A = r e- w ™ (A2) 

B = r (A3) 
The sample mean estimators for the expectations of A and B are, 

1 Ns 

A = j^X An (A4) 

s n—1 



B = j^J2 Bn ( A5 ) 



n=l 



These form an estimator for the free energy, 



F A = -ln(A/B) (A6) 

However, the sample mean estimators deviate from their true expectations, 

A = (A) - AA (A7) 
B = (B) - AB (A8) 

Here, subscripts on the angle brackets are omitted for notational simplicity. 

Deviations in these expectations lead to variance and bias in F\. The magnitude of the error can be estimated by 
a Taylor series expansion about F\, which becomes an increasingly reasonable approximation in the large sampling, 
or asymptotic, limit. To the first order, this expansion is, 

F A = -mj^-f (A9) 
(B) — AB { ' 



OF 



F A + HrT ) AA + ) AB ( Al °) 



OA J \dB J 

(AA AB\ 

- Fa ~\M)-Jb)) 

The partial derivatives are evaluated at their mean values. 

The variance is defined as ct 2 [-Fa] = ((-Fa — Fa) 2 )- Using the first-order Taylor series expansion, this is, 



(AH) 



( AA AB 



2 \ 



^"Uw-wJ/ (A12) 

I AA 2 AB 2 AAAB \ ,. <0 , 

= {w + w -ms>) (A13) 

(AA 2 ) (AB 2 ) (AAAB) 

= ^aT + ^bT~ 2 ^W (AU) 

" (Af + (Bf (A) (B) (A15) 
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The variance of a sample mean is the variance of the variable over the number of samples. Since A and B are 
functions of the same data points, the covariance of their sample means is, similarly, a 2 [A, B] — j^a 2 [A, B]. Thus, 
the asymptotic variance is, 



1 

N 
1 

N 



(A 2 ) (B 2 ) 2 (AB) 



(Ay 



(B) 



(A) (B) 



(A 2 ) e 2F A + (B 2 ) - 2 (AB) e?* 



(B) 



1 ^ r 2 e -2(W[X\A]-F A ) +r 2 _ 2r 2 e ~(W[X\A]-F A )\ l 

^ Ws ' 

1 / r 2( e -(W[X|A]-F A ) _ !)2\ 



(A16) 
(A17) 
(A18) 
(A19) 



The bias is defined as Bn = (-Fa) — F\. If we approximate this error with a first-order Taylor series expansion, it 
is always zero. In order to obtain a nonzero bias expression, we use a second-order Taylor series expansion about Fa, 



F A 



In 



(A) - AA 

(B) - AB 



F A+ d J±AA + d JiAB 



dA 



dB 



2 dA 2 dAdB 



AAAB 



1 d 2 F A A - 2 



2 dB'- 



-AB Z 



(AA AB 1 AA 2 1 AB 2 

~ Fa ~ { (I) ~ JB) 2jAf + *W, 

Using this approximation, the bias is found to be, 



B 



N 



1 

2 

1 

27V 

1 

2N 



(AA 2 ) (AB 2 ) 

(A) 2 (B) 2 
(A 2 ), 



,2F A 



(B 2 ) 



(By (b) 

x (^,2 ^,-2(W[X\K]-F K ) _ ^ 



2N„ 



(A20) 
(A21) 
(A22) 

(A23) 
(A24) 
(A25) 
(A26) 



Notably, when the dissipated work is zero, W[X| A] — Fa = 0, expressions for both the variance and bias are likewise 
zero. 



APPENDIX B: BIDIRECTIONAL 



Here, we consider the possibility of reanalyzing bidirectional data, collected using both a protocol A and its time 
reversal A. The results derived in this section suggest that for bidirectional data, the optimal analysis protocol is 
actually the sampling protocol. Thus, protocol postprocessing is less promising when applied to bidirectional data 
than to unidirectional data. 

For notational consistency, we start our discussion with the Crooks Fluctuation Theorem,—^ 



Pa[X] 



,W[X\A]-F A 



(Bl) 



As in the main text, pa[A] is the probability of observing trajectory X, given the protocol A. Analogously, is 
the probability of observing the time reversal, or conjugate twin, of X , using the reverse protocol A. 
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This theorem can be used to derive a relationship between forward and reverse path-ensemble averages^ 



(T A [X]) A = JdX T A [X]p A [X] (B2) 
= fdX T A [X}e w[xlA] - FA p A [X] (B3) 



= (T A [X]e- w ™- F -) (B4) 

In the above, T A [X] is an arbitrary functional. As the last path-ensemble average is over trajectories X, trajectories 
sampled from A must be reversed prior to being evaluated with the functional. 

Next, we rearrange the path-ensemble average theorem into an expression for the free energy^ 

e~ F - = - ^ a W)a (B5) 

^ A [X]e- w ^Wj . 

As in the unidirectional case, these path-ensemble averages can be written as reweighed samples from other sampling 
densities. 

e"" = ^ r (B6) 

\ r )s (rT A [X]e-w[x\h]\ 

(^m) s 8 (B7) 



[rF A [X]e- w l x W 
The probability ratio f is dehned similarly to r, 

~„_P A [X] 



(B8) 



ps[X] 

Using Eq. (|B1[) . we can show that it is related to r by 

r = f e W[x\A]-w[x\A,] (B9) 

The ratio of (r) s /(r) s can be shown to be unity by converting (f) s into a forward path-ensemble average using 
Eqs. (|B4[) and (|B9[) . and applying the importance sampling form of Jarzynski's equality, Eq. (|Aip . 

The asymptotic variance of Eq. (|BT|) can be calculated by a similar procedure to the unidirectional case. We start 
by defining, 

C = rJ- A [X] (BIO) 
D = ?T A [X]e- w ™ (Bll) 

Replacing A and B with C and D, we follow the same logic as in the unidirectional case from Eqs. (| A6|l to (|A14p . 
Next, we note that C and D are independent samples drawn from different ensembles and their correlation is zero. 
Thus, the variance estimate is, 



= + ~h^2 -(— + —) ( B12 ) 



(C 2 ) (D 2 ) (11 
N S (C} 2 N S (D) 2 \N S N s 

{r 2 nm) s (r 2 nme- 2W[m ). s fl i 



-> 2 

/ 5 



N. Nz 



(B13) 



N s (rT A [X}) s N _ (rF A [X]e- w \ x \^ 
We would like to combine the two terms including C and D in a single path-ensemble average. In order to do that 



we need to convert the ensemble averages containing D to the forward direction. For (D), this 



(D) 



dXf^ A [X]e- w ^W p s [X) 

dX re w[x\A s ]-w[x\A] Th[x]e w[x\A] Ps[X ] e - w ^ A ^ 



dX rT A [X]p s [X]e 



f a 



{rf A [X} 



,F A 



For (D 2 ), this is 



f 2 Tl[X]e- 2W ^\ 

I S 

dX f 2 F 2 [X}e~ 2W ^ p s [X] 

dX r 2 e 2(W[X\A s ]-W[X\A]) j^jqfWlXW p& [X]e -W[X\A a ]+F A 

J dX r 2 Tl[X]e w W^p s [X]e F - 
r 2 T 2 [X]e w ^ A A ^ 



Using Eqs. (fB13|) . (|B1~8|) . and (fB23|) . we obtain, 



a 2 [Fa] = 



J_ 4. J_ P W[X\A„]-F A 



i i 



Now if we choose the functional^, 



Fa[X] = 



_L i J_ e W[X\A s ]~F A 

N. Ns 



j;-l e W[X\\]-W[X\A.] 



1 1 



~W[X\A B ]-F A 



then we obtain a generalized form of the Bennett Acceptance Ratio, as derived by Crooks. 5 
Variational optimization of Eq. (|B24p . however, leads to the functionals, 



1 , 1 c w\x\a s ]-f a 



?a[X] = r- 2 
F K [X] = f.-^ e nw[x\A]-w[x\A 3 \) 



_L i J_ e -W[X\A S ]-F A 

N, Ns 
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Substituting these into Eq. (|B7[) leads to 



.[i + i 



,W[X|A S 1-F A 



e W[X|A]-2W[X|A s ] 
?[Tk + Tk e ~ W[ * lkB] ~ FA \ 

N, 



r 1+e 



M + W[X|A S ]-F A 



^ , e W[X\k]-W{X\k s ] e ~WlX\A s ]-F A 

1 



Fa 



r |l + e M + W[X|A a ]-F A I 



e W[X|A]-W[X|A s ] 
f\ e -(M-W[X\K 3 ]-F K ) +1 } 



s_ e M-F A 



(B29) 



(B30) 



(B31) 



where M = In ^ . 

This expression is analogous to Bennett's original expression. It can be solved self-consistently or by rearrangement 
into, 



N s 



1 



r [l + e M+W[X\A s ]-F A j 



N s 



,W[X\A]-W[X\As 



I 4. e -(M-W[X\A,]-F A ) 



= 



(B32) 



Using the sample mean estimator for the expectations, this is, 

N * - A p W[X } \A]-W[X } \A s ] 



T [1 + e M+W[X n |A s ]-F A ] ^ . ^ + e _ (M _ w[ x 3 .|A s ]-F A ) 



1 



E 



(B33) 



which is similar to the expression of Shirts et. al. for the Bennett Acceptance Ratic^i. This is an implicit function 
of Fa which is solved by finding the zero of the equation. 

The variance of this expression can be found by plugging the optimal functionals into Eq. (IB24[) . 



AFa] 



J_ _|_ J_ p W[X\A,]-F A 



J_ 4. J- P W[X\A a ]-F A 



1 1 

N~ S + N~ S 



(B34) 



We would like to express this equation in a form in which it is clear how to include data sampled from forward and 
reverse path-ensembles. The path-ensemble average in the numerator is, 



_L + 1 W\X\A S ]-F A 

N» AT? 



1 -1 ' 



dX 



dX 



N sPs [X] 



" 2 [l + e M+W[X\A s ]-F A j 

N s p s [X] + N sPs [X] 



r 2 [2 + 2cosh(M + W[A"|A a ] - Fa)] 
We obtain Eq. (|B36j) from Eq. (|B35|) by multiplying it by 

1 + e -(M+W[X\A s ]-F A ) 
1 4. e -(M+W[X\A s ]-F A ) ■ 

Splitting the integral into two, we obtain a form amenable to treating forward and reverse switching data, 

N sPs [X] , f djt N sPs [X]e 2 ( w ^- w ^^ 



dX 



^ 2 [2 + 2cosh(Af + TU[A|A S ] - F A )] 



P 2 + 2 cosh(M -W[X\A S ]-F A ) 



(B35) 
(B36) 

(B37) 
(B38) 
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which can be estimated with, 



E 



[2 + 2 cosh(M + W[X n \K s ] - F A )] 



E- 



2(W[^|A]-W[Jf 3 -|A»]) 



j=i f 2 [2 + 2 cosh(M - W[Xj\A a 



Fa) 



By analogous procedure, the path-ensemble average in the denominator of Eq. (|B34[) is, 



E 



1 



Et 



J r [2 + 2 cosh(M + VF[A„|A S ] - F A )] ^ f [2 + 2 cosh(M - W[Xj |A S ] - F A ) 
Finally, we obtain the variance estimator of the bidirectional free energy calculation, 



E 



N, 



n=l r2[2+2co S h(Jl/+W[X„| As ]-F A )] ^ ^'=1 f 3 [2+2 cosh(M-W[X s \A,]-F A )] 



E 



2(lV[X J -|A]-W[X j |A s ]) 



,n=l r[2+2cosh(M+W[X n \A s ]-F A )] " 1 " ^=1 f[2+2 cosh(A/-W[X,- |A 3 ]-F A )J 



-f^ + ^) 113411 



1 

iV. 



(B39) 



(B40) 



As with unidirectional data, a natural question to ask is how to find the optimal analysis protocol when processing 
bidirectional data. It no longer makes sense to reduce the lag; a protocol which reduces the lag for forward trajectories 
may increase it for trajectories from the reverse protocol. Furthermore, when Eq. l|B34p is variationally optimized 
with respect to r, the optimal r is found to be constant. Since this only occurs when A = A s , this result suggests 
that, compared with unidirectional data, protocol processin g less likely to be useful for treating bidirectional data. 



